Fix torchvision compatibility check for source builds and future torch versions #3978

Merged: danielhanchen merged 6 commits into main from fix-torchvision-compat-check on Feb 4, 2026

Member

@danielhanchen danielhanchen commented Feb 4, 2026

Summary

  • On AMD machines with source-built PyTorch (torch==2.7.0+gitf717b2a, torchvision==0.21.0+7af6987), torchvision_compatibility_check() raised a hard ImportError that blocked from unsloth import FastLanguageModel entirely, even though the build was functional.
  • The hardcoded compatibility table also silently skipped any torch version not already listed, giving zero feedback for future releases like torch 2.10+.

Changes

Custom/source build detection -- Detects source builds by checking the raw version string's local identifier (the part after +) against known standard prefixes (cu\d, rocm\d, cpu, xpu). This must operate on the raw string from importlib_version() because our custom Version() wrapper strips local identifiers via regex before parsing. The regex requires cu/rocm to be followed by a digit (avoiding false negatives on suffixes like +custom_build), is case-insensitive (handles +ROCM6.3), and matches cpu/xpu as exact strings.
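
A minimal sketch of this detection, reconstructed from the description above and the pattern quoted in the review thread rather than copied from import_fixes.py. The name is_custom_build is hypothetical, and treating a version with no + as a standard release is an assumption of the sketch.

```python
import re

# Standard local identifiers: cu/rocm followed by a digit, or exactly cpu/xpu.
# Pattern taken from the review thread; IGNORECASE covers "+ROCM6.3" and "+CPU".
_STANDARD_LOCAL = re.compile(r"cu\d[\d.]*|rocm\d[\d.]*|cpu|xpu", re.IGNORECASE)


def is_custom_build(raw_version: str) -> bool:
    """Return True when the local identifier (the part after '+') is non-standard."""
    if "+" not in raw_version:
        return False  # assumption: no local identifier means a standard release
    local = raw_version.split("+", 1)[1]
    if not local:
        return False
    # fullmatch: the whole identifier must be standard, not just its prefix.
    return _STANDARD_LOCAL.fullmatch(local) is None


print(is_custom_build("2.7.0+gitf717b2a"))  # True
print(is_custom_build("2.7.0+cu124"))       # False
print(is_custom_build("2.7.0+ROCM6.3"))     # False
```

The check runs on the raw string because, as noted above, the Version() wrapper has already discarded the local identifier by the time parsing finishes.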

Pre-release/nightly detection -- Detects .dev, a0, b0, rc, alpha, beta, nightly tags in the raw torch version string. Nightly/dev/rc builds with standard CUDA or ROCm suffixes (e.g. 2.7.0.dev20250301+cu124) now produce a warning instead of a hard ImportError, since these builds commonly have version mismatches that are expected during development.
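
A rough sketch of the tag check, using the tag list from this PR. The function name is hypothetical, and plain case-insensitive substring matching on the raw string is an assumption about how the tags are applied.

```python
# Tags listed in the PR description.
PRE_RELEASE_TAGS = (".dev", "a0", "b0", "rc", "alpha", "beta", "nightly")


def looks_like_pre_release(raw_version: str) -> bool:
    """Return True if the raw version string contains any pre-release tag."""
    lowered = raw_version.lower()
    return any(tag in lowered for tag in PRE_RELEASE_TAGS)


print(looks_like_pre_release("2.7.0.dev20250301+cu124"))  # True
print(looks_like_pre_release("2.7.0+cu124"))              # False
```

Substring matching is deliberately permissive: a spurious hit only downgrades an error to a warning, it never blocks the import.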

Formula-based forward compatibility -- The torch-to-torchvision minor version mapping follows a consistent formula that has held for every release from torch 1.7 through 2.9:

  • torch 1.x -> torchvision 0.(x + 1) (verified: 1.7 through 1.13)
  • torch 2.x -> torchvision 0.(x + 15) (verified: 2.0 through 2.9)

For versions not in the known table, the formula is used as a fallback. Mismatches from formula-predicted versions produce a warning rather than a hard error, since the formula could in theory change.
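
The two rules above can be written as a small function. This is a sketch only: the name and the (major, minor) tuple return shape are hypothetical, and returning None outside the verified ranges (torch below 1.7, or a major version other than 1 or 2) is an assumption.

```python
from typing import Optional, Tuple


def infer_required_torchvision(torch_major: int, torch_minor: int) -> Optional[Tuple[int, int]]:
    """Predict the torchvision (major, minor) paired with a torch (major, minor)."""
    if torch_major == 1 and torch_minor >= 7:
        return (0, torch_minor + 1)   # torch 1.x -> torchvision 0.(x + 1)
    if torch_major == 2:
        return (0, torch_minor + 15)  # torch 2.x -> torchvision 0.(x + 15)
    return None  # assumption: no prediction outside the verified ranges


print(infer_required_torchvision(2, 9))   # (0, 24)
print(infer_required_torchvision(2, 10))  # (0, 25)
```

For torch 2.10 the formula predicts torchvision 0.25, which is why a mismatch there only warns: the prediction is extrapolated, not confirmed.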

Graceful degradation -- Wraps importlib_version() and Version() calls in try/except so broken package metadata never crashes the import.
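
A sketch of the guarded metadata lookup, assuming importlib_version is an alias for importlib.metadata.version. The helper name and its return shape are hypothetical.

```python
from importlib.metadata import version as importlib_version


def read_raw_versions(packages=("torch", "torchvision")):
    """Return {name: raw version string}, or None if any lookup fails."""
    try:
        return {name: importlib_version(name) for name in packages}
    except Exception:
        # Missing or broken metadata: give up on the check instead of crashing.
        return None


versions = read_raw_versions()
if versions is None:
    print("Could not read package versions; skipping the compatibility check.")
else:
    print(versions)
```

A caller would treat None as "skip the check", so a corrupted installation degrades to no validation rather than a failed import.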

Environment variable override -- UNSLOTH_SKIP_TORCHVISION_CHECK=1 skips the check entirely for environments where the user knows the build is correct.
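
A minimal sketch of the override test, using the case-insensitive form adopted later in review. The helper name and the injectable environ parameter are hypothetical conveniences for testing.

```python
import os


def skip_check_requested(environ=os.environ) -> bool:
    """Return True when UNSLOTH_SKIP_TORCHVISION_CHECK is set to 1 or true (any casing)."""
    value = environ.get("UNSLOTH_SKIP_TORCHVISION_CHECK", "0")
    return value.lower() in ("1", "true")


print(skip_check_requested({"UNSLOTH_SKIP_TORCHVISION_CHECK": "TRUE"}))  # True
print(skip_check_requested({}))                                          # False
```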

Behavior matrix:

| Scenario | Old behavior | New behavior |
| --- | --- | --- |
| Standard release mismatch (e.g. +cu124) | ImportError | ImportError (unchanged) |
| Source/custom build mismatch (e.g. +git*) | ImportError | Warning only |
| Nightly/dev/rc build mismatch (e.g. .dev*+cu124) | ImportError | Warning only |
| Future torch version, correct torchvision | Silent skip | Logs compatible |
| Future torch version, wrong torchvision | Silent skip | Warning |
| UNSLOTH_SKIP_TORCHVISION_CHECK=1 | N/A | Skips check |
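
The mismatch rows of the matrix reduce to a short decision. The sketch below is illustrative only: the function and parameter names are hypothetical, and it covers only the case where the installed torchvision does not satisfy the required version.

```python
def mismatch_outcome(*, skip_requested: bool, custom_build: bool,
                     pre_release: bool, formula_predicted: bool) -> str:
    """Map one mismatch scenario to 'skip', 'warn', or 'raise'."""
    if skip_requested:
        return "skip"   # UNSLOTH_SKIP_TORCHVISION_CHECK=1
    if custom_build or pre_release or formula_predicted:
        return "warn"   # source build, nightly/dev/rc, or future torch version
    return "raise"      # standard release listed in the known table


print(mismatch_outcome(skip_requested=False, custom_build=False,
                       pre_release=False, formula_predicted=False))  # raise
print(mismatch_outcome(skip_requested=False, custom_build=True,
                       pre_release=False, formula_predicted=False))  # warn
```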

Validated with 384 standard-table backwards-compat tests (100% match with old behavior), plus 50+ additional tests covering custom builds, pre-releases, forward compat, edge cases, and adversarial inputs.

Test plan

  • Verify standard CUDA/ROCm release mismatches still raise ImportError (384 tests, 100% match)
  • Verify source-built versions (+gitXXX, +HEXHASH) produce a warning instead of ImportError
  • Verify nightly/dev/rc builds with standard suffixes produce a warning instead of ImportError
  • Verify formula correctly predicts torchvision requirements for future torch 2.10+
  • Verify UNSLOTH_SKIP_TORCHVISION_CHECK=1 bypasses the check
  • Verify all other version checks in import_fixes.py (xformers, vllm, datasets, fbgemm_gpu, etc.) are unaffected
  • Verify regex handles case insensitivity (+ROCM6.3, +CPU)
  • Verify regex precision (no false negatives on +custom_build, +cust, etc.)

…h versions

The torchvision version check raised a hard ImportError for custom/source-built
PyTorch installations (e.g. AMD ROCm from source with +git* suffixes), even when
the actual build was functional. This also silently skipped any torch version
not already in the hardcoded table, giving no warning at all for future releases.

Changes:
- Detect custom/source builds by checking the raw version string's local
  identifier against known standard prefixes (cu, rocm, cpu, xpu). Our custom
  Version() strips local identifiers via regex, so detection must happen on the
  raw string before parsing.
- Downgrade to a warning (instead of ImportError) for custom/source builds,
  since their version numbers may not follow standard PyPI release pairings.
- Add formula-based inference for future torch versions not yet in the table.
  The torch->torchvision minor version formula (torch 2.x -> tv 0.(x+15)) has
  held for every release from torch 2.0 through 2.9. For formula-predicted
  versions, mismatches produce a warning rather than a hard error.
- Add UNSLOTH_SKIP_TORCHVISION_CHECK=1 env var to skip the check entirely.
- Wrap importlib_version and Version calls in try/except so broken metadata
  never crashes the import.
@gemini-code-assist
Contributor

Summary of Changes

Hello @danielhanchen, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly enhances the torchvision_compatibility_check function by making it more resilient and user-friendly. It resolves issues where source-built PyTorch environments would encounter blocking import errors and ensures that future PyTorch versions are handled gracefully through a predictive formula. The changes also introduce error handling for malformed version strings and provide an explicit override mechanism, collectively improving the reliability and flexibility of dependency checks within the library.

Highlights

  • Custom/Source Build Handling: Implemented _is_custom_torch_build to correctly identify source-built PyTorch/Torchvision versions, preventing ImportError for functional custom builds and instead issuing a warning.
  • Forward Compatibility with Formula: Introduced _infer_required_torchvision to predict compatible Torchvision versions for future PyTorch releases using a consistent formula, ensuring graceful handling of new versions not yet in the hardcoded table.
  • Robust Version Parsing: Added try/except blocks around importlib_version and Version calls to prevent crashes if package metadata is malformed or unparseable.
  • User Override: Enabled skipping the entire compatibility check via the UNSLOTH_SKIP_TORCHVISION_CHECK environment variable, providing flexibility for advanced users.

Changelog
  • unsloth/import_fixes.py
    • Added _is_custom_torch_build function to detect custom or source builds of PyTorch/Torchvision by inspecting the raw version string's local identifier.
    • Added _infer_required_torchvision function to programmatically determine the expected torchvision minor version based on the torch major and minor versions, using a consistent formula.
    • Modified torchvision_compatibility_check to include an environment variable check (UNSLOTH_SKIP_TORCHVISION_CHECK) to bypass the compatibility logic.
    • Wrapped calls to importlib_version and Version in try/except blocks to prevent crashes from invalid version strings.
    • Refactored the TORCH_TORCHVISION_COMPAT table from a list of strings to a dictionary of tuples for better readability and direct lookup.
    • Implemented logic to prioritize the known compatibility table and then fall back to the formula-based inference for newer PyTorch versions.
    • Changed the behavior for custom/source builds or formula-predicted mismatches from raising a hard ImportError to logging a warning, improving user experience in these specific scenarios.
    • Updated logging messages to include raw version strings for clarity.
Activity
  • No specific activity has been recorded for this pull request yet.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Code Review

This pull request significantly improves the torchvision compatibility check by introducing detection for custom/source builds and adding forward compatibility for future torch versions using a formula-based approach. Instead of raising a hard ImportError for these cases, it now issues a warning, making the library more robust. It also adds an environment variable to skip the check entirely. My review focuses on improving the robustness of the new detection logic. I've identified a potential bug in the custom build detection regex and a minor improvement for handling the environment variable. Overall, these are great changes that will improve the user experience.

Comment thread unsloth/import_fixes.py Outdated
    local = raw_version_str.split("+", 1)[1]
    if not local:
        return False
    return not re.match(r"(cu|rocm|cpu|xpu)", local)

high

The current regex r"(cu|rocm|cpu|xpu)" with re.match is too broad and can incorrectly classify custom builds as standard ones. For example, a version with a local identifier like +cu121_custom would be considered a standard build because re.match would successfully match the prefix "cu". This could lead to unexpected ImportErrors instead of warnings for such custom builds, which undermines a key goal of this PR.

To ensure that only standard local identifiers are matched, you should use re.fullmatch to match the entire string and a more specific regex.

Suggested change
-    return not re.match(r"(cu|rocm|cpu|xpu)", local)
+    return not re.fullmatch(r"(?:cu\d[\d.]*|rocm\d[\d.]*|cpu|xpu)", local)
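
The difference between the two calls can be seen on the +cu121_custom example from this comment. A small standalone sketch using only Python's re module:

```python
import re

local = "cu121_custom"  # local identifier from the review comment's example

# re.match anchors only at the start, so the "cu" prefix is enough to match.
loose = re.match(r"(cu|rocm|cpu|xpu)", local)

# re.fullmatch requires the whole identifier to fit the pattern.
strict = re.fullmatch(r"(?:cu\d[\d.]*|rocm\d[\d.]*|cpu|xpu)", local)

print(loose is not None)   # True: misread as a standard build
print(strict is not None)  # False: correctly treated as custom
```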

Comment thread unsloth/import_fixes.py Outdated

def torchvision_compatibility_check():
    # Allow skipping via environment variable for custom environments
    if os.environ.get("UNSLOTH_SKIP_TORCHVISION_CHECK", "0") in ("1", "true", "True"):

medium

The check for the environment variable UNSLOTH_SKIP_TORCHVISION_CHECK is case-sensitive for "true". To make it more robust and align with common practices for boolean environment variables, consider converting the value to lowercase before the check.

Suggested change
-    if os.environ.get("UNSLOTH_SKIP_TORCHVISION_CHECK", "0") in ("1", "true", "True"):
+    if os.environ.get("UNSLOTH_SKIP_TORCHVISION_CHECK", "0").lower() in ("1", "true"):

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 99a06160fd

Comment thread unsloth/import_fixes.py
Comment on lines +571 to +575
    # Extract major.minor from the parsed version
    torch_release = torch_v.release
    if len(torch_release) < 2:
        return
    torch_major, torch_minor = torch_release[0], torch_release[1]

P2: Allow pre-release/nightly pairs without false mismatch

This logic derives the required torchvision version purely from torch_v.release (e.g., 2.8.0.dev20240301 becomes major/minor 2,8) and later compares against a final release string like 0.23.0. For nightly/dev builds, torchvision is typically 0.23.0.dev…, which Version() ranks below 0.23.0, so the check will raise ImportError even when the matching nightly pair is installed. This is a regression for standard nightly builds with local tags like +cu121 (not treated as custom). Consider detecting pre-releases (torch_v.is_prerelease or tv_v.is_prerelease) and downgrading to a warning or skipping the strict >= comparison for nightly pairs.

danielhanchen and others added 3 commits February 4, 2026 12:29
…tion

Fixes three edge cases found during review:

1. Regex precision: cu/rocm now require a trailing digit (cu\d, rocm\d) to
   avoid false negatives on suffixes like "+custom_build" that happen to
   start with "cu". cpu/xpu match as exact strings only.

2. Case insensitivity: added re.IGNORECASE so "+ROCM6.3" and "+CPU" are
   correctly recognized as standard builds rather than custom ones.

3. Pre-release detection: nightly/dev/alpha/beta/rc builds with standard
   CUDA/ROCm suffixes (e.g. "2.7.0.dev20250301+cu124") now produce a
   warning instead of a hard ImportError. These builds commonly have
   version mismatches that are expected during development.
…e-release

1. Switch re.match to re.fullmatch for the custom build regex so the
   entire local identifier must match. Fixes false negatives where
   suffixes like +cu124_custom were misclassified as standard because
   re.match only checked the start of the string.

2. Use .lower() for the UNSLOTH_SKIP_TORCHVISION_CHECK env var so
   any casing of "true" / "TRUE" / etc. is accepted.

3. Check torchvision_version_raw for pre-release tags in addition to
   torch_version_raw, so a stable torch paired with a nightly
   torchvision (e.g. 0.23.0.dev...) also gets a warning instead of
   a hard ImportError.
@danielhanchen
Member Author

Addressed all three review comments in 133deda:

Regex precision (gemini): Switched from re.match to re.fullmatch with r"cu\d[\d.]*|rocm\d[\d.]*|cpu|xpu" so the entire local identifier must match. This fixes false negatives like +cu124_custom or +rocm6.3_custom that were incorrectly passing because re.match only checked the prefix.

Env var casing (gemini): Changed to .lower() in ("1", "true") so any casing of TRUE/True/true is accepted.

Pre-release detection in torchvision (codex): Now checks both torch_version_raw and torchvision_version_raw for pre-release tags (.dev, a0, b0, rc, alpha, beta, nightly). This handles the case where stable torch is paired with a nightly torchvision.

384 standard-table backwards-compat tests still match old behavior exactly. All edge cases pass.

@danielhanchen
Member Author

/gemini review

Contributor

@gemini-code-assist gemini-code-assist Bot left a comment

Code Review

This pull request significantly improves the torchvision compatibility check by adding support for source builds, pre-release versions, and future torch releases. The changes are well-structured, introducing graceful fallbacks with warnings instead of hard errors, which enhances user experience. The logic for detecting custom builds and inferring compatibility is robust. My review includes a couple of suggestions to further improve debuggability by adding logging to exception handlers, which currently fail silently. Overall, this is a high-quality contribution that makes the library more resilient.

Comment thread unsloth/import_fixes.py
Comment on lines +554 to +555
    except Exception:
        return

medium

The try...except Exception block silently ignores errors when fetching package versions. While this prevents the application from crashing during import, it can hide underlying environment issues, such as a corrupted installation. Logging a warning here would provide valuable feedback to the user that the compatibility check was skipped and why.

Suggested change
-    except Exception:
-        return
+    except Exception as e:
+        logger.warning(f"Unsloth: Could not determine torch/torchvision versions, skipping compatibility check. Error: {e}")
+        return

Comment thread unsloth/import_fixes.py
Comment on lines +560 to +561
    except Exception:
        return

medium

Similar to the previous block, this try...except silently fails when parsing version strings. This can make it difficult to diagnose problems related to malformed version numbers. Logging a warning that includes the problematic version strings would greatly improve debuggability for users.

    except Exception as e:
        logger.warning(
            f"Unsloth: Could not parse torch/torchvision versions, skipping compatibility check. "
            f"Versions: torch='{torch_version_raw}', torchvision='{torchvision_version_raw}'. Error: {e}"
        )
        return

Comment thread unsloth/import_fixes.py

# Detect nightly/dev/alpha/beta/rc builds from the raw version string.
# These often have version mismatches that are expected.
_pre_tags = (".dev", "a0", "b0", "rc", "alpha", "beta", "nightly")

medium

The _pre_tags tuple is a constant collection of strings. It's a good practice to define such constants at the module level (e.g., as _TORCHVISION_PRE_RELEASE_TAGS) rather than inside a function. This improves readability and maintainability by making it clear that this is a fixed set of values used for the check.

@danielhanchen danielhanchen merged commit 5b64813 into main Feb 4, 2026
4 checks passed
@danielhanchen danielhanchen deleted the fix-torchvision-compat-check branch February 4, 2026 12:50
abiswas-realadvice pushed a commit to abiswas-realadvice/unsloth that referenced this pull request May 14, 2026
…h versions (unslothai#3978)

* Fix torchvision compatibility check for source builds and future torch versions

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Address review: stricter regex, case insensitivity, pre-release detection

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Address PR review comments: fullmatch, env var casing, torchvision pre-release

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

---------

Co-authored-by: Daniel Han <danielhanchen@users.noreply.github.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>